Comparing Boolean and Probabilistic Information Retrieval Systems across Queries and Disciplines

Author

  • Robert M. Losee
Abstract

Whether Boolean queries or ranking documents by document and term weights yields better retrieval performance has been the subject of considerable discussion among document retrieval system users and researchers. We suggest a method that allows one to compare the two approaches to retrieval analytically and examine their relative merits. The performance of information retrieval systems may be determined either by experimental simulation or by analytic techniques that directly estimate retrieval performance, given values for query and database characteristics. Using these performance-predicting techniques, sample performance figures are provided for queries using the Boolean AND and OR operators, as well as for probabilistic systems assuming statistical term independence or term dependence. The variation of performance across sublanguages (used in different academic disciplines) and across queries is examined, as is the performance of models that fail to meet statistical and other assumptions.
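As a rough illustration of the retrieval styles the abstract names, the sketch below contrasts Boolean AND, Boolean OR, and a ranking that sums log-odds term weights under the term-independence assumption. It is not the paper's analytic performance-prediction method. The toy collection, the function names, and the probabilities `P` and `Q` are all hypothetical values chosen for the example.

```python
import math

# Hypothetical toy collection: each document is the set of terms it contains.
DOCS = {
    "d1": {"boolean", "retrieval"},
    "d2": {"boolean"},
    "d3": {"retrieval"},
    "d4": {"ranking"},
}


def boolean_and(query, docs):
    """Return (unranked) documents containing every query term."""
    return sorted(d for d, terms in docs.items() if all(t in terms for t in query))


def boolean_or(query, docs):
    """Return (unranked) documents containing at least one query term."""
    return sorted(d for d, terms in docs.items() if any(t in terms for t in query))


def independence_rank(query, docs, p, q):
    """Rank documents by summed log-odds term weights.

    Assumes terms occur independently of one another, with
    p[t] = P(term present | relevant) and q[t] = P(term present | nonrelevant).
    """
    def weight(t):
        return math.log(p[t] * (1 - q[t]) / (q[t] * (1 - p[t])))

    scores = {
        d: sum(weight(t) for t in query if t in terms)
        for d, terms in docs.items()
    }
    # Highest score first; ties broken by document id for a stable order.
    return sorted(scores, key=lambda d: (-scores[d], d))


QUERY = ["boolean", "retrieval"]
P = {"boolean": 0.8, "retrieval": 0.6}  # assumed, not taken from the paper
Q = {"boolean": 0.2, "retrieval": 0.3}  # assumed, not taken from the paper

and_hits = boolean_and(QUERY, DOCS)
or_hits = boolean_or(QUERY, DOCS)
ranking = independence_rank(QUERY, DOCS, P, Q)
```

With these assumed values, AND retrieves only the document containing both terms, OR retrieves every document containing either term without ordering them, and the weighted ranking orders all documents by how strongly their terms indicate relevance.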

Similar resources

Improved Skips for Faster Postings List Intersection

Information retrieval can be achieved through computerized processes by generating a list of relevant responses to a query. The document processor, matching function, and query analyzer are the main components of an information retrieval system. Document retrieval systems are fundamentally based on Boolean, vector-space, probabilistic, and language models. In this paper, a new methodology for mat...

Public Transport Ontology for Passenger Information Retrieval

Passenger information aims at improving the user-friendliness of public transport systems while influencing passenger route choices to satisfy transit user’s travel requirements. The integration of transit information from multiple agencies is a major challenge in implementation of multi-modal passenger information systems. The problem of information sharing is further compounded by the multi-l...

Using Structured Queries for Disambiguation in Cross-Language Information Retrieval

Bilingual transfer dictionaries are an important resource for query translation in cross-language text retrieval. However, term translation is not an isomorphic process, so dictionary-based systems must address the problem of ambiguity in language translation. In this paper, we claim that Boolean conjunction (the AND operator) provides simple and automatic disambiguation in the target languag...

Effective Information Retrieval Method Based on Matching Adaptive Genetic Algorithm

Information Retrieval (IR) systems are complex because of the interactions between documents and queries, which means that matching document representations to query representations is not straightforward. The Genetic Algorithm (GA) is widely used in IR systems to improve the effectiveness of such systems. This study uses the Vector Space Model (VSM) and the Extended Boole...

Journal title:
  • JASIS

Volume 48

Publication date: 1997